sample code
MCCoder: Streamlining Motion Control with LLM-Assisted Code Generation and Rigorous Verification
Li, Yin, Wang, Liangwei, Piao, Shiyuan, Yang, Boo-Ho, Li, Ziyue, Zeng, Wei, Tsung, Fugee
Large Language Models (LLMs) have shown considerable promise in code generation. However, the automation sector, especially in motion control, continues to rely heavily on manual programming due to the complexity of tasks and critical safety considerations. In this domain, incorrect code execution can pose risks to both machinery and personnel, necessitating specialized expertise. To address these challenges, we introduce MCCoder, an LLM-powered system designed to generate code that addresses complex motion control tasks, with integrated soft-motion data verification. MCCoder enhances code generation through multitask decomposition, hybrid retrieval-augmented generation (RAG), and self-correction with a private motion library. Moreover, it supports data verification by logging detailed trajectory data and providing simulations and plots, allowing users to assess the accuracy of the generated code and bolstering confidence in LLM-based programming. To ensure robust validation, we propose MCEVAL, an evaluation dataset with metrics tailored to motion control tasks of varying difficulties. Experiments indicate that MCCoder improves performance by 11.61% overall and by 66.12% on complex tasks in the MCEVAL dataset compared with base models using naive RAG. This system and dataset aim to facilitate the application of code generation in automation settings with strict safety requirements. MCCoder is publicly available at https://github.com/MCCodeAI/MCCoder.
GitHub - oracle-samples/automlx: This repository contains demo notebooks (sample code) for the AutoMLx (automated machine learning and explainability) package from Oracle Labs.
This repository contains demo notebooks (sample code) for the AutoMLx (automated machine learning and explainability) package from Oracle Labs. The notebooks are intended to show how to initialize, train and explain an AutoML model in a few lines of code. The notebooks also cover many of the advanced features available in the AutoMLx package. Pre-executed copies of each of the demo notebooks are available as html files, which can be viewed without installing anything. The demo notebooks in this repository serve as supplementary documentation for the AutoMLx package.
Effective Testing for Machine Learning (Part II)
Since the mean of the target variable decreased, the regression problem got easier. Picture the distribution of a numerical variable: a model predicting zero will have an MAE equal to the absolute mean of the distribution; now, imagine you add recently generated data that increases the concentration of your target variable even more (i.e., the mean decreases): if you evaluate the model that always predicts zero, the MAE will decrease, giving the impression that your new model got better! After meeting with business stakeholders, we found out that a recent change in the data source introduced spurious observations.
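The effect described above can be reproduced with a few lines of arithmetic. The following is a minimal sketch, assuming a non-negative target (so that the MAE of an always-zero prediction equals the mean of the target); the `mae` helper and the toy values are illustrative and do not come from the original post.

```python
from statistics import fmean


def mae(y_true, y_pred):
    """Mean absolute error between two equally long sequences."""
    return fmean([abs(t - p) for t, p in zip(y_true, y_pred)])


# Hypothetical target values before the data source changed (mean = 11).
old_targets = [8.0, 10.0, 12.0, 14.0]
# Hypothetical recent observations concentrated near zero (mean = 2).
new_targets = [1.0, 2.0, 3.0, 2.0]
combined = old_targets + new_targets

# The "model" never changes: it always predicts zero.
mae_old = mae(old_targets, [0.0] * len(old_targets))
mae_combined = mae(combined, [0.0] * len(combined))

print(mae_old)       # 11.0
print(mae_combined)  # 6.5
```

The metric improves from 11.0 to 6.5 even though the predictor is unchanged, which is why a drop in MAE after a data refresh should be checked against the distribution of the target before it is credited to the model.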
Effective Testing for Machine Learning (Part I)
Update: Part II is out now! This blog post series describes a strategy I've developed over the last couple of years to test Machine Learning projects effectively. Given how uncertain ML projects are, this is an incremental strategy that you can adopt as your project matures; it includes test examples to provide a clear idea of how these tests look in practice, and a complete project implemented with Ploomber is available on GitHub. By the end of the post, you'll be able to develop more robust ML pipelines. Testing Machine Learning projects is challenging. Training a model is a long-running task that may take hours to run and has a non-deterministic output, which is the opposite of what we need to test software: quick and deterministic procedures. One year ago, I published a post on testing data-intensive projects to make Continuous Integration feasible.
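One common way to get quick and deterministic tests out of a training step is to run it on a small, seeded sample. The sketch below illustrates that idea only; it is not taken from the post or from Ploomber, and `make_sample`, `train`, and the toy one-parameter model are hypothetical names chosen for illustration.

```python
import random


def make_sample(seed, n=200):
    """Build a small synthetic dataset; the seed makes it reproducible."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [3.0 * x + rng.gauss(0.0, 0.1) for x in xs]
    return xs, ys


def train(xs, ys):
    """Stand-in for a training step: least-squares slope through the origin."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def test_train_is_deterministic_and_sane():
    slope_a = train(*make_sample(seed=42))
    slope_b = train(*make_sample(seed=42))
    # Same seed and same code must give exactly the same result.
    assert slope_a == slope_b
    # The fitted slope should land near the value used to generate the data.
    assert abs(slope_a - 3.0) < 0.1


test_train_is_deterministic_and_sane()
```

A test like this runs in milliseconds and can sit in Continuous Integration, while the full training run is checked separately and less often.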
Python 3.9 vs Python 3.10: A Feature Comparison
The past decade has seen numerous programming languages developed and updated to make programming easier and to help build the next Artificial Intelligence (AI) or Machine Learning (ML) system. The traditionally dominant languages were Java, C#, and others. Over time, however, Python has risen to the top of the list of favourites, largely because of the ease with which developers can handle complex coding challenges. Python is a high-level, robust programming language focused mainly on rapid application development. Thanks to its core functionality, Python has become one of the fastest-growing programming languages and an obvious choice for programmers building applications in machine learning, AI, big data, and IoT.
A Must-Have Tool for Every Data Scientist
Let's face it: training a machine learning model is time-consuming. Even with the advances in computing power over the past few years, training machine learning models takes a lot of time. Even fairly small deep learning models can have more than a million parameters. On a bigger scale, models have billions of parameters (GPT-3 has 175 billion!), and training them takes days, if not weeks. As data scientists, we want to keep an eye on a model's metrics to know whether it is performing as expected.
Machine Learning Algorithms. Here's the End-to-End.
While there are several documents and articles on machine learning algorithms, I wanted to provide a summary of the most common ones I use as a professional data scientist. Additionally, I will include some sample code with dummy data so that you can start executing various models! Whereas unsupervised learning, like the commonly used K-means algorithm, aims to group similar data points together without labels, supervised learning, or classification, assigns data to predefined categories. A simple example of classification is described below. The classification model learns from the features of the fruits to assign a fruit label to an input food.
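The fruit example can be sketched with dummy data as follows. This is a minimal illustration, not the article's own code: it uses a nearest-centroid classifier as a stand-in for whichever model the author has in mind, and the features (weight in grams, length in centimetres), the values, and the function names are all assumptions made for the example.

```python
from math import dist

# Dummy labelled data: (weight in grams, length in cm) -> fruit label.
training = [
    ((150.0, 7.0), "apple"),
    ((170.0, 7.5), "apple"),
    ((160.0, 6.8), "apple"),
    ((120.0, 18.0), "banana"),
    ((130.0, 19.5), "banana"),
    ((115.0, 17.0), "banana"),
]


def fit_centroids(samples):
    """Learn one centroid (the mean feature vector) per label."""
    grouped = {}
    for features, label in samples:
        grouped.setdefault(label, []).append(features)
    return {
        label: tuple(sum(column) / len(rows) for column in zip(*rows))
        for label, rows in grouped.items()
    }


def predict(centroids, features):
    """Assign the label whose centroid is closest to the input."""
    # Features are left unscaled here for brevity; real data would
    # usually be standardised before computing distances.
    return min(centroids, key=lambda label: dist(centroids[label], features))


centroids = fit_centroids(training)
print(predict(centroids, (155.0, 7.2)))   # apple
print(predict(centroids, (125.0, 18.5)))  # banana
```

The model "learns" only the average features of each fruit, which is enough to show the supervised pattern: labelled examples go in, and a new input comes out with one of the known labels.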